Joint Semi-supervised Similarity Learning for Linear Classification
Authors
Abstract
The importance of metrics in machine learning has attracted growing interest in distance and similarity learning. We study this problem in the setting where few labeled data (and potentially few unlabeled data as well) are available, a situation that arises in several practical contexts. We also provide a complete theoretical analysis of the proposed approach. It is indeed worth noting that the metric learning field lacks theoretical guarantees on the generalization capacity of the classifier associated with a learned metric. The theoretical framework of (ε, γ, τ)-good similarity functions [1] has been one of the first attempts to draw a link between the properties of a similarity function and those of a linear classifier making use of it. In this paper, we extend this theory to a method where the metric and the separator are jointly learned in a semi-supervised way, a setting that has not been explored before, and provide a theoretical analysis of this joint learning via Rademacher complexity. Experiments performed on standard datasets show the benefits of our approach over state-of-the-art methods.
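To make the role of a good similarity function concrete, the sketch below illustrates the supervised building block analysed by the (ε, γ, τ)-good framework of Balcan et al.: each example is mapped to its vector of similarities to a set of landmark points, and a sparse linear separator is learned in that similarity space. This is not the paper's joint semi-supervised method; the Gaussian similarity, the landmark choice, and the regularisation strength are illustrative assumptions.

# Minimal sketch of a similarity-map + sparse linear separator pipeline
# (the supervised setting analysed by the (eps, gamma, tau)-good theory).
import numpy as np
from sklearn.svm import LinearSVC

def similarity_map(X, landmarks, gamma=1.0):
    """phi(x) = [K(x, l_1), ..., K(x, l_d)] with a Gaussian similarity K (assumption)."""
    d2 = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# toy data: two Gaussian blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 5)), rng.normal(1, 1, (50, 5))])
y = np.array([0] * 50 + [1] * 50)

landmarks = X[rng.choice(len(X), size=20, replace=False)]  # landmark sample
phi = similarity_map(X, landmarks)

# L1 regularisation encourages the separator to rely on few landmarks,
# mirroring the sparsity argument in the (eps, gamma, tau)-good analysis.
clf = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=1.0).fit(phi, y)
print("training accuracy:", clf.score(phi, y))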
Similar resources
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
Semi-supervised Learning with Regularized Laplacian
We study a semi-supervised learning method based on the similarity graph and the Regularized Laplacian. We give a convenient optimization formulation of the Regularized Laplacian method and establish its various properties. In particular, we show that the kernel of the method can be interpreted in terms of discrete and continuous time random walks and possesses several important properties of proximi...
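For reference, the standard Regularized Laplacian kernel mentioned in this snippet is K = (I + βL)^{-1} with L = D − W the Laplacian of the similarity graph. The sketch below computes it and uses it for simple label spreading; the Gaussian graph weights and β are illustrative assumptions, not the cited paper's specific formulation.

# Minimal sketch of the Regularized Laplacian kernel on a similarity graph.
import numpy as np

def regularized_laplacian_kernel(W, beta=1.0):
    L = np.diag(W.sum(axis=1)) - W          # combinatorial Laplacian L = D - W
    return np.linalg.inv(np.eye(len(W)) + beta * L)

# toy similarity graph over 6 points, first and last labeled
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (3, 2)), rng.normal(2, 0.3, (3, 2))])
W = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
np.fill_diagonal(W, 0.0)

K = regularized_laplacian_kernel(W, beta=2.0)

Y = np.zeros((6, 2))
Y[0, 0] = 1.0   # point 0 labeled as class 0
Y[5, 1] = 1.0   # point 5 labeled as class 1
scores = K @ Y  # propagate the labels through the kernel
print(scores.argmax(axis=1))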
Kernel-based transition probability toward similarity measure for semi-supervised learning
To improve classification performance at low cost, it is necessary to exploit both labeled and unlabeled samples by applying semi-supervised learning methods, most of which are built upon the pairwise similarities between samples. While the similarities have so far been formulated in a heuristic manner, such as by k-NN, we propose methods to construct similarities from the probabilis...
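A generic version of the idea referred to here is to turn a kernel into transition probabilities by row normalisation, p(j | i) = k(x_i, x_j) / Σ_l k(x_i, x_l), and symmetrise the result into a similarity matrix for graph-based semi-supervised learning. The sketch below shows this generic construction, not the cited paper's exact estimator; the Gaussian kernel is an assumption.

# Minimal sketch: kernel-based transition probabilities as a similarity measure.
import numpy as np

def transition_similarity(X, gamma=1.0):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)
    np.fill_diagonal(K, 0.0)              # ignore self-transitions
    P = K / K.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    return 0.5 * (P + P.T)                # symmetrised similarity for graph methods

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
S = transition_similarity(X)
print(np.allclose(S, S.T))  # True: usable as an affinity matrix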
Locally Linear Metric Adaptation with Application to Image Retrieval
Many supervised and unsupervised learning algorithms are very sensitive to the choice of an appropriate distance metric. While classification tasks can make use of class label information for metric learning, such information is generally unavailable in conventional clustering tasks. Some recent research sought to address a variant of the conventional clustering problem called semi-supervised c...
Fast semi-supervised SVM classifiers using a priori metric information
This paper describes a support vector machine-based (SVM) parametric optimization method for semi-supervised classification, called LIAM (for LInear hyperplane classifier with A-priori Metric information). Our method takes advantage of similarity information to leverage the unlabeled data in training SVMs. In addition to the smoothness constraints in existing semi-supervised methods, LIAM incor...